{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How GPflow relates to TensorFlow: Tips & Tricks \n",
    "--\n",
    "\n",
    "GPflow is built on top of TensorFlow, so it is useful to have some understanding of how TensorFlow works. Particularly, TensorFlow's two-stage concept of first building a static compute graph and then executing it for specific input values passed in gives rise to a bunch of potential pitfalls. This notebook aims to help with the most common issues."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gpflow\n",
    "import tensorflow as tf\n",
    "import numpy as np\n",
    "\n",
    "from gpflow import settings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Computation time increases when I create gpflow objects."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The example below illustrates a typical situation when computation time increases proportionally to the number of GPflow objects that are created."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "for n in range(2, 4):\n",
    "    kernel = gpflow.kernels.RBF(input_dim=1)  # This is a gpflow object with tf.Variables inside\n",
    "    x = np.random.randn(n, 1)  # gpflow expects rank-2 input matrices, even for D=1\n",
    "    kxx = kernel.K(x)  # This is a tensor!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Remember, we operate on a tensorflow graph!**\n",
    "\n",
    "Every time we create (build and compile) a new GPflow object, we continue to add more tensors to the graph and only change the reference to them, despite overwriting, in this case, the kernel variable.\n",
    "\n",
    "So, unnecessary expansion of the graph slows down your computation!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In examples below we are fixing the issue (imagine running this code snippet in `ipython` again and again):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "for n in range(2, 4):\n",
    "    gpflow.reset_default_graph_and_session()\n",
    "    kernel = gpflow.kernels.RBF(1)\n",
    "    x = np.random.randn(n, 1)\n",
    "    kxx = kernel.K(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we were simply resetting the default graph and session using gpflow's `reset_default_graph_and_session()` function. In the next example we explicitly build new `tf.Graph()` and `tf.Session()` objects:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "for n in range(2, 4):\n",
    "    with tf.Graph().as_default() as graph:\n",
    "        with tf.Session(graph=graph).as_default():\n",
    "            kernel = gpflow.kernels.RBF(1)\n",
    "            x = np.random.randn(n, 1)\n",
    "            kxx = kernel.K(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the [External Mean Function notebook](tailor/external-mean-function.ipynb) we show a more real-world example of this idea."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. How can I reuse a model on different data?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-2.8766930392437593\n"
     ]
    }
   ],
   "source": [
    "np.random.seed(1)\n",
    "x = np.random.randn(2, 1)\n",
    "y = np.random.randn(2, 1)\n",
    "kernel = gpflow.kernels.RBF(1)\n",
    "model = gpflow.models.GPR(x, y, kernel)\n",
    "print(model.compute_log_likelihood())\n",
    "\n",
    "x_new = np.random.randn(100, 1)\n",
    "y_new = np.random.randn(100, 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can compute the loglikelihood of the model on different data. Note, we didn't change the original model!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-140.83017563045192"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x_tensor = model.X.parameter_tensor\n",
    "y_tensor = model.Y.parameter_tensor\n",
    "model.compute_log_likelihood(feed_dict={x_tensor: x_new, y_tensor: y_new})  # we can still probe the model with the old data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can do the same by updating (permanently) the values of the dataholders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-140.83017563045192"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.X = x_new\n",
    "model.Y = y_new\n",
    "model.compute_log_likelihood()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. I would like to use external TensorFlow tensors and pass them to a GPflow model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can pass tensorflow tensors for any non-trainable parameters of the gpflow objects like DataHolders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-196.46001677717464"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.seed(1)\n",
    "kernel = gpflow.kernels.RBF(1)\n",
    "likelihood = gpflow.likelihoods.Gaussian()\n",
    "\n",
    "x_tensor = tf.random_normal((100, 1), dtype=settings.float_type)\n",
    "y_tensor = tf.random_normal((100, 1), dtype=settings.float_type)\n",
    "z = np.random.randn(10, 1)\n",
    "\n",
    "model = gpflow.models.SVGP(x_tensor, y_tensor, kern=kernel, likelihood=likelihood, Z=z)\n",
    "model.compute_log_likelihood()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also use TensorFlow variables for trainable objects:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "z = tf.Variable(np.random.randn(10, 1))\n",
    "model = gpflow.models.SVGP(x_tensor, y_tensor, kern=kernel, likelihood=likelihood, Z=z)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But you have to initialize them manually, before interacting with a model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-193.12198293763578"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "session = gpflow.get_default_session()\n",
    "session.run(z.initializer)\n",
    "model.compute_log_likelihood()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. I would like to share parameters between GPflow objects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sometimes we want to impose a hard-coded structure to the model.\n",
    "\n",
    "For example, we have a multi-output model where some output dimensions share the same kernel and others don't.\n",
    "\n",
    "Unfortunately we cannot do this after the kernel object is compiled. We have to do it at build time and then manually compile the object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "with gpflow.decors.defer_build():\n",
    "    kernels = [gpflow.kernels.RBF(1) for _ in range(3)]\n",
    "    mo_kernels = gpflow.multioutput.kernels.SeparateMixedMok(kernels, W=np.random.randn(3, 4))\n",
    "    mo_kernels.kernels[0].lengthscales = mo_kernels.kernels[1].lengthscales\n",
    "    mo_kernels.compile()\n",
    "\n",
    "assert mo_kernels.kernels[0].lengthscales is mo_kernels.kernels[1].lengthscales"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. Optimizing my model repeatedly slows down the computation time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bad practice:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.random.randn(100, 1)\n",
    "y = np.random.randn(100, 1)\n",
    "model = gpflow.models.GPR(x, y, kernel)\n",
    "\n",
    "optimizer = gpflow.training.AdamOptimizer()\n",
    "\n",
    "optimizer.minimize(model, maxiter=2)\n",
    "\n",
    "# Do something with the model\n",
    "\n",
    "optimizer.minimize(model, maxiter=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `minimize()` call creates a bunch of optimization tensors. Calling `minimize()` again causes the same issue discussed under question (1)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The correct way of doing it without polluting your graph:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "kernel = gpflow.kernels.RBF(1)\n",
    "x = np.random.randn(100, 1)\n",
    "y = np.random.randn(100, 1)\n",
    "model = gpflow.models.GPR(x, y, kernel)\n",
    "\n",
    "optimizer = gpflow.training.AdamOptimizer()\n",
    "optimizer_tensor = optimizer.make_optimize_tensor(model)\n",
    "session = gpflow.get_default_session()\n",
    "for _ in range(2):\n",
    "    session.run(optimizer_tensor)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Don't forget to **anchor** your model to the session after optimisation steps. Then you can continue working with your model.<br/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "model.anchor(session)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, if you need to optimize it again, you can reuse the same optimizer tensor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "for _ in range(2):\n",
    "    session.run(optimizer_tensor)\n",
    "\n",
    "model.anchor(session)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6. When I try to read parameter values, I'm getting stale values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(1)\n",
    "x = np.random.randn(100, 1)\n",
    "y = np.random.randn(100, 1)\n",
    "\n",
    "kernel = gpflow.kernels.RBF(1)\n",
    "model = gpflow.models.GPR(x, y, kernel)\n",
    "optimizer = gpflow.training.AdamOptimizer()\n",
    "optimizer_tensor = optimizer.make_optimize_tensor(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The initial value before optimisation is"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(1.)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.kern.lengthscales.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's call one step of the optimization and check the new value of the parameter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(1.)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gpflow.get_default_session().run(optimizer_tensor)\n",
    "model.kern.lengthscales.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After optimization you would expect that the parameters were updated, but they weren't. The trick is that the `value` property returns a cached numpy value of a parameter."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can get the value of the optimized parameter via the `read_value()` method, specifying the correct `session`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.0006322362558255"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.kern.lengthscales.read_value(session)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or you can `anchor(session)` your model to the session after the optimisation step. The `anchor()` updates parameters' cache. **The `anchor(session)` is significantly more time-consuming than `read_value(session)`, do not call it too often without necessity.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(1.00063224)"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.anchor(session)\n",
    "model.kern.lengthscales.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7. How can I save/load a GPflow model?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "kernel = gpflow.kernels.RBF(1)\n",
    "x = np.random.randn(100, 1)\n",
    "y = np.random.randn(100, 1)\n",
    "model = gpflow.models.GPR(x, y, kernel)\n",
    "\n",
    "from pathlib import Path\n",
    "filename = \"/tmp/gpr.gpflow\"\n",
    "path = Path(filename)\n",
    "if path.exists():\n",
    "    path.unlink()\n",
    "saver = gpflow.saver.Saver()\n",
    "saver.save(filename, model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can load the model back into a different graph:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.Graph().as_default() as graph, tf.Session().as_default():\n",
    "    model_copy = saver.load(filename)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or you can load the model into the same session:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "ctx_for_loading = gpflow.saver.SaverContext(autocompile=False)\n",
    "model_copy = saver.load(filename, context=ctx_for_loading)\n",
    "model_copy.clear()\n",
    "model_copy.compile()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between the former and the latter approach is in tensorflow name scopes which are used for naming variables. The former approach replicates the instance of the tensorflow objects (which already exist in the original graph), so we need to load it in a new graph.\n",
    "The latter uses different name scopes for the variables so we can dump the model in the same graph."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
